IMC-Denoise: a content aware denoising pipeline to enhance Imaging Mass Cytometry

Imaging Mass Cytometry (IMC) is an emerging multiplexed imaging technology for analyzing complex microenvironments using more than 40 molecularly-specific channels. However, this modality has unique data processing requirements, particularly for patient tissue specimens where signal-to-noise ratios for markers can be low, despite optimization, and pixel intensity artifacts can deteriorate image quality and downstream analysis. Here we demonstrate an automated content-aware pipeline, IMC-Denoise, to restore IMC images deploying a differential intensity map-based restoration (DIMR) algorithm for removing hot pixels and a self-supervised deep learning algorithm for shot noise image filtering (DeepSNiF). IMC-Denoise outperforms existing methods for adaptive hot pixel and background noise removal, with significant image quality improvement in modeled data and datasets from multiple pathologies. This includes technically challenging human bone marrow, where we achieve a noise level reduction of 87% for a 5.6-fold higher contrast-to-noise ratio, and more accurate background noise removal with an approximately 2× improved F1 score. Our approach enhances manual gating and automated phenotyping with cell-scale downstream analyses. Verified by manual annotations, spatial and density analyses for targeted cell groups reveal subtle but significant differences of cell populations in diseased bone marrow. We anticipate that IMC-Denoise will provide similar benefits across mass cytometric applications to more deeply characterize complex tissue microenvironments.

Figure 2a: segmentation based on thresholding. Reference 21 suggests finding a threshold for every marker and every image individually to identify background signal. It looks like in the example shown here (DIMR_thresh) this manual threshold was not chosen at the best possible intensity level, including a lot of background. This gives an unfair advantage to other methods presented (and unfair discredit to the ref 21 Ijsselsteijn paper).
Supp figs 27/28 & fig 2e-h: compares single cell profiles from the different noise removal approaches, all using the cell mask from DeepSNF. I appreciate that it is preferred to minimise variability and stick with one cell mask. But, is DeepSNF not always favoured in the results if the cell mask used was based on DeepSNF? (Taken to the extreme: if the raw data projects circles, but DeepSNF gives only triangles as output, then a triangular mask will favour DeepSNF in every downstream analysis). Perhaps the reciprocal experiment might be done as well: comparing raw, DIMR and DeepSNF using a single cell mask based on raw or DIMR data?
It is not clear what cell mask was used to produce figure 3. Again only DeepSNF, or each their own?
Figure 3c shows cell annotation when using either DIMR or DeepSNF: These figures are very difficult to interpret, a higher magnification would help. Figure 3d is a better example.
Figure 3d and e: When looking at the image in close-up, it becomes apparent that the data in DeepSNF looks very far from the original data. Much more seems to have changed than just filtering out background noise. Negative pixels seem filled in as positive, the image looks "enhanced". When calculating the positive marker percentage in figure 3d, is that using these "enhanced" values, and how real are these? If the signal per cell has been enhanced, then it is of no surprise that the percentage positive marker per cell type is higher.
And following from that: Is the mean signal per individual cell still the same and is the signal still linear? Can the DeepSNF data be used to say something about expression levels between cells?
Overall assessment: I think the results look beautiful and the clean-up is impressive, but I have some doubt whether the comparisons made are often favouring the DeepSNF outcome. And besides showing the improvement, I'd like to also see that the signal is still true to the original, not just a neural network's interpretation. Can the DeepSNF data faithfully be used to say something about expression levels between cells? Or has the underlying data been altered too much?
Finally, IMC analysis is very complicated and this is a barrier for many users. A simple all-round algorithm to improve noise reduction would certainly contribute to the field. While IMC-denoise potentially improves the quality of the data significantly, it has not been made clear how others might make use of this technology. How simple would it be to run for a new user? Are there any input variables that need to be optimised? How much programming skill would be required? I miss details on implementation, especially as the authors end with: "We expect IMC-Denoise to become a widely used pipeline in IMC analysis due to its adaptability, effectiveness and flexibility". A schematic figure describing the workflow for a user of IMC-Denoise would be helpful.

Minor comments:
DeepSNF as a name for the algorithm is not original and may cause confusion. SNF has been used as abbreviation for Similarity Network Fusion and based on this there is even a paper describing DeepSNF: Luciano & Hamza, 2019 https://doi.org/10.1007/s00371-019-01668-9
The introduction mentions decalcification required bone marrow samples, but the methods section does not mention whether this was applied to the tissues analysed here.
The results section compares many different alternative methods for filtering noise. The many abbreviations for these methods make the text quite difficult to read for those that are not familiar with these methods.
The last paragraph discussing fig 2b and 2c should be revised, due to repetition and some inconsistency (p10).
Labelling of figure 4a, should CD20 be green in the left panels?
Reviewer #2: Remarks to the Author:
What are the noteworthy results?
-Lu et al. describe a novel denoising pipeline for highly multiplexed IMC images which involves two steps: differential intensity map-based restoration (DIMR) and a self-supervised deep learning algorithm for filtering shot noise (DeepSNF). Lu et al. demonstrate the use of this pipeline for enhancing the cell intensities of 12 markers for 96232 cells that they claim resulted in analysis similar to manual annotation of each dataset.
Will the work be of significance to the field and related fields? How does it compare to the established literature? If the work is not original, please provide relevant references.
-This work is relevant to all fields which make use of histopathology and immunohistochemistry techniques, including but not limited to cancer and autoimmunity.
-Current methods of removing image noise are imperfect and improvements are needed.
-IMC allows to visualize and extract data from >50 analytes on a single tissue section. This is of value because the interpretation of some analytes is only possible in the context of other analytes; because some samples are rare and irreplaceable, because it can be difficult to visualize low density antigens against background using standard immunohistochemistry methods such as immunofluorescence.
Does the work support the conclusions and claims, or is additional evidence needed?
-Yes the work supports the claims and conclusions. Denoising with DIMR and DeepSNF appears to reduce background to a comparable degree or in some cases better than other algorithms used in the literature, but also appears to reduce resolution visually.
-Fig 1f: some markers seem to disappear after DeepSNF denoising (ie CD31, CD20). Is this biologically true? Were there other analytes tested in the same tissue to corroborate the results (ie CD19 for B cells?)
-Fig 2h is showing that CD20+ gated cells express some amount of MPO. Is this a biological truth or is it an artifact of the image segmentation? Same with CD3+ and CD20+ gated populations having some expression of CD11b.
Are there any flaws in the data analysis, interpretation and conclusions? Do these prohibit publication or require revision?
-With the disclosure that I am not an expert on computer algorithms and bioinformatic analysis, it does seem that the data analysis and interpretation support the conclusion, with the caveats stated above.
Is the methodology sound? Does the work meet the expected standards in your field?
-Yes, the initial segmentation analysis is based on a well-established pipeline published by the Bodenmiller group.
Is there enough detail provided in the methods for the work to be reproduced?
-Yes
Additional comments/questions: Authors are encouraged to discuss the questions posed below:
-What is the maximum amount of "noise" before a channel is deemed unable to denoise? Is there an empirical way to determine this?
-Have analytes that have low signal-to-noise ratio been tested?
-How would the pipeline adapt to analytes that should visualize targets localized in-between nucleated cells (ie. synapses)? Denoising these channels often results in reduction of image intensity in areas where a signal is expected.
-How would the software handle instances where cell signal is unexpected; ie if a novel cell type is discovered, how would they distinguish between true positive and false positive? Has this been tested on tissues where there is a mix of morphologically heterogenous cells? (ie brain tissue where there are a mix of glial cells, immune cells, and fibroblasts). What happens when there are processes extending from glial cells that are present in the section but their corresponding nuclei are not in the same plane?
Reviewer #3: Remarks to the Author: This study develops IMC-Denoise, a denoising automated pipeline to enhance Imaging Mass Cytometry (IMC) images. Two primary sources of noise are here addressed: hot pixels and shot noise. To remove hot pixels, IMC implements a differential intensity map-based restoration (DIMR); to filter shot noise, it implements a deep learning method, DeepSNF. The approach presented here is of intrinsic interest as it develops some new algorithms and approaches that are illuminating and useful. The extensive benchmarking and comparisons with existing denoising methods are strengths of the study. The end-to-end analysis, including the impact on automated cell phenotyping, is important and further strengthens the study. Some concerns that should be addressed to improve the interpretability and, hopefully, the utility of the study are listed below.
1. How many iterations of DIMR does it take to adequately remove the hot pixels in data? Some quantification of the variability would be very helpful. (The paper indicates n = 4 and iteration number is set at 3.) How computationally intensive is the approach?
2. The definition of the threshold point x_T (top of page 6) as the value of x at which the second derivative, evaluated at x-dx, of the fitted curve from the kernel density estimation algorithm is greater than or equal to 0 (while the second derivative evaluated at x is less than or equal to zero) is mathematically imprecise and problematic. The "x-dx" bit is problematic. Please clarify this definition.
3. In the expression for the loss function of the DeepSNF model: the pixel mask is missing from equation 2 (and equation 31) in the regularization term (with coefficient given by the Hessian norm regularizer).
4. It would be useful for the authors to provide a justification for the specific choice of neural network architecture. The deep learning methodology is given in very general terms and is somewhat of a blackbox.
The manuscript can also benefit from thorough language editing.

Response to Reviewers:
Overview: The revised manuscript and supplementary materials have been uploaded for further review. The detailed responses to the reviewer comments are below, in the order received. We have enumerated and formatted our responses (in italicized blue text) to ensure that they can be clearly followed. Data and figure elements are enclosed throughout, in order to assist the Reviewers to follow the rationale for our responses and assess improvements. The locations of these changes in the main manuscript and the supplemental files are listed.
Before the point-by-point response, we have a few general notes to the Reviewers:
-Per a comment from Reviewer 1, we have renamed DeepSNF to DeepSNiF to avoid confusion with other algorithms. All of the references to the algorithm in this response letter and throughout the manuscript have been amended.
-The initial submission of the manuscript included rigorous evaluation and comparison with both conventional and state of the art approaches to denoising and classification tasks in highly multiplexed single cell data. In the resubmission, we further expand the scope of these comparisons and the basis for our parameter choices. These additional experiments, performed on both synthetic and real-world data, underline the utility and wide applicability of DeepSNiF and IMC-Denoise.
-We include amended figures directly in this response letter to ease the Reviewer evaluation of our responses.
Reviewer #1: R1.0: IMC-Denoise is a new approach that addresses a common problem with Imaging Mass Cytometry data: random hot pixels and shot noise background. Most commonly used analysis pipelines only use fairly basic filtering strategies, and usually accept a significant residual amount of these two noise factors. A simple all-round algorithm to improve noise reduction would certainly contribute to the field. While I do not have the expertise to assess the soundness of the mathematical formulations or algorithms used, the presented images look very clean and sharp. However, I have a few concerns related to the fairness of the comparisons made to prove superiority to other methods in some of the figures, that I have set out below.
Figure 2a: segmentation based on thresholding. Reference 21 suggests finding a threshold for every marker and every image individually to identify background signal. It looks like in the example shown here (DIMR_thresh) this manual threshold was not chosen at the best possible intensity level, including a lot of background. This gives an unfair advantage to other methods presented (and unfair discredit to the ref 21 Ijsselsteijn paper).

R1.1:
We certainly were not intending to draw any unfair comparisons in our work. As an example, when comparing DIMR to NTHM and MTHM hot pixel removal methods, we performed this on the basis of optimal parameters for these other methods. To the present comment regarding thresholding, we have undertaken further evaluation to provide a comprehensive and fair comparison. In order to ensure that the manual thresholds chosen are optimal for the comparisons in benchmarking, we include multiple thresholds for all the CD34 and Collagen III images used in Figure  For DeepSNiF-processed images, we also selected multiple thresholds from 0.9 to 1.2 because the pixel values are continuous. Note that in this case, a single threshold was selected for all the images per marker for DeepSNiF-processed images, because we find this has already achieved good background removal performance (an advantage of our method). We have added the following figure  Using these more thorough comparisons, Fig. 2a-c have also been revised. We now use representative panels of optimally selected threshold values. In addition, we have made some edits to clarify this modification on Lines 247-251 on Page 10 of the article file (see redline version).
To summarize: The conclusions from these updated results remain consistent with the initial manuscript; specifically, DeepSNiF provides a means to generate background removal and denoising with minimal user-variable interaction. Manual threshold value choice has an impact on the results of the comparison with existing methods, but the magnitudes of these changes are small; DeepSNiF-based methods achieve best-in-class results even when the optimal thresholds are selected for DIMR_thresh.
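The threshold comparison described in this response can be illustrated with a short sketch. This is not the authors' benchmarking code: it assumes that a ground-truth foreground mask is available for each image, and the function names (f1_score_mask, sweep_thresholds, best_threshold) and the toy image are hypothetical. It shows how several candidate thresholds can be scored by F1 so that the best one is used for each method being compared.

```python
import numpy as np


def f1_score_mask(pred, truth):
    """F1 score between a predicted and a reference boolean mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)    # foreground found correctly
    fp = np.count_nonzero(pred & ~truth)   # background kept as signal
    fn = np.count_nonzero(~pred & truth)   # signal removed as background
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def sweep_thresholds(image, truth, thresholds):
    """Score the mask `image > t` for every candidate threshold t."""
    return {float(t): f1_score_mask(image > t, truth) for t in thresholds}


def best_threshold(image, truth, thresholds):
    """Return the threshold with the highest F1 and all scores."""
    scores = sweep_thresholds(image, truth, thresholds)
    return max(scores, key=scores.get), scores


# Toy example (hypothetical data): the two brightest pixels are true signal.
image = np.array([[0.0, 1.0], [2.0, 3.0]])
truth = image >= 2.0
best, scores = best_threshold(image, truth, [0.5, 1.5, 2.5])
print(best, scores)
```

Scoring every method at its own best threshold, rather than at one hand-picked value, is one way to avoid the unfair comparison the Reviewer was concerned about.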

R1.2:
Supp figs 27/28 & fig 2e-h: compares single cell profiles from the different noise removal approaches, all using the cell mask from DeepSNF. I appreciate that it is preferred to minimise variability and stick with one cell mask. But, is DeepSNF not always favoured in the results if the cell mask used was based on DeepSNF? (Taken to the extreme: if the raw data projects circles, but DeepSNF gives only triangles as output, then a triangular mask will favour DeepSNF in every downstream analysis). Perhaps the reciprocal experiment might be done as well: comparing raw, DIMR and DeepSNF using a single cell mask based on raw or DIMR data?
The Reviewer is correct: we had used a single mask as a means to provide a direct comparison across methods. We thank the Reviewer for the suggestion of evaluating multiple masks to further assess the performance of existing approaches and our novel method, and have performed the Reviewer-suggested experiment.
To do so, we have extracted the raw, DIMR, DIMR_Ilastik and DeepSNiF single cell data from the segmentation masks of DIMR images. We have added the single cell data profile comparisons as new Supplementary Fig. 33
Figure 3c shows cell annotation when using either DIMR or DeepSNF: These figures are very difficult to interpret, a higher magnification would help. Figure 3d is a better example.

R1.4:
We thank the Reviewer for this comment. Demonstrating the impact of our approach to improve the image quality and quantitation is a challenge, and we appreciate the suggestion to reduce the field of view. We have revised Fig. 3c with higher magnification images. We have also provided a new Supplementary Fig. 36 on Page 57 of the Supplementary Materials.
New Fig. 3c Comparisons of DIMR and DeepSNiF-processed IMC images labeled with different cell markers, and the corresponding cell annotation results. The sub-panels (i)-(iv) in (c) correspond to the white dashed box region selection in their first panels, respectively. The white contours represent the differential phenotyping results between DIMR and DeepSNiF.
Supplementary Fig. 36 Comparisons of DIMR and DeepSNiF-processed IMC images labeled with different cell markers, and the corresponding cell annotation results with the DeepSNiF-based cell segmentation masks (Fig. 3(c)
Figure 3d and e: When looking at the image in close-up, it becomes apparent that the data in DeepSNF looks very far from the original data. Much more seems to have changed than just filtering out background noise. Negative pixels seem filled in as positive, the image looks "enhanced". When calculating the positive marker percentage in figure 3d, is that using these "enhanced" values, and how real are these? If the signal per cell has been enhanced, then it is of no surprise that the percentage positive marker per cell type is higher.

R1.5:
The Reviewer asks why the data from the denoised images looks different. First, we will mention that in response to the comment above (R1.4), we have revised this figure to show the images in greater magnification to facilitate easier interpretation by the reader. Additionally, we have modified the scale of the circle diameter in Fig. 3E to more clearly demonstrate the relative positive marker change. Figure 3D shows the DIMR and DeepSNiF-processed IMC images of bone marrow with CD3, CD4 and CD8a specific markers, and the corresponding binary cell annotation results. In the next panel, Figure 3E illustrates the positive marker percentage across 12 select immune targets, and the relative sensitivity of improvements by DeepSNiF. The Reviewer notes that the images (Fig. 3D) of DIMR (hot pixel removed) and DeepSNiF-denoised data appear different. This is correct. The reason that the data in DeepSNiF appear different is that the original data in the CD3, CD4 and CD8a channels are extremely noisy. These are widely recognized in the field as difficult-to-stain markers because of background signal; these effects are only increased in the complex and further difficult-to-stain marrow. With DeepSNiF, a large amount of background noise has been filtered. Note that noise also exists within the signal under these low SNR conditions. This could be the feature that the Reviewer remarks on as looking like noise from negative pixels.
In the further two right panels of Fig. 3C,

The denoising effects on the image are not a result of efforts to "enhance" our image pixel values or single-cell signal. Several analyses were included in the original manuscript, and expanded here, to prove this statement. First, we have calculated peak SNR (PSNR) and structural similarity (SSIM) in our simulation experiments (revised Supplementary Figs. 8-11 from Page 28 to 31 of the Supplementary Materials). These results demonstrate that the PSNR and SSIM improve after denoising, which means the pixel values after denoising are closer to the ground truth values. Second, we have conducted line fitting for DIMR and DeepSNiF-processed single cell data (revised Supplementary Figs. 32a and 33b on Page 53 to 54 of the Supplementary Materials and shown below). This analysis shows that the ranges of DIMR and DeepSNiF single-cell data are highly similar (correlated), no matter which cell segmentation masks are used. Across all the markers, the slope of the fitted line is nominally 1, except for the CD20 marker. The exception for CD20 arises from the very high degree of noise in this channel: the large amount of low intensity noise is filtered almost to 0, which may bias the line fitting. Nevertheless, the ranges of the DIMR and DeepSNiF-processed data are very close. To summarize, we have not "enhanced" our data. Rather, the higher percentage positive marker results from the decreased false positives due to improved image and data quality following denoising.
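The line-fitting check described in this response can be sketched as follows. This is an illustrative example only, not the authors' analysis script: it assumes per-cell mean intensities for one marker have already been extracted from the DIMR and DeepSNiF-processed images with the same segmentation mask, and the function name fit_line and the toy values are hypothetical. A slope near 1 with a high correlation indicates that the scale and linearity of the single-cell data are preserved.

```python
import numpy as np


def fit_line(dimr_means, deepsnif_means):
    """Least-squares line y = slope * x + intercept, plus Pearson r.

    x: per-cell mean intensity from DIMR-processed images
    y: per-cell mean intensity from DeepSNiF-processed images
    """
    x = np.asarray(dimr_means, dtype=float)
    y = np.asarray(deepsnif_means, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # highest power first
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r


# Toy per-cell values (hypothetical): denoised means track the DIMR means.
dimr = [0.5, 1.0, 2.0, 4.0, 8.0]
deepsnif = [0.4, 1.0, 2.1, 3.9, 8.1]
slope, intercept, r = fit_line(dimr, deepsnif)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A channel in which many low-intensity cells are pushed towards 0 after denoising would pull the fit away from a slope of 1, which is the effect the authors describe for CD20.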

R1.6:
And following from that: Is the mean signal per individual cell still the same and is the signal still linear? Can the DeepSNF data be used to say something about expression levels between cells?
Fig. 32b on Page 53 and Supplementary Note 1.2.1 on Page 4 of the Supplementary Materials, lower signal is associated with higher shot noise. The implication of this is that more low-quality data will be restored by DeepSNiF.

R1.8:
Overall assessment: I think the results look beautiful and the clean-up is impressive, but I have some doubt whether the comparisons made are often favouring the DeepSNF outcome. And besides showing the improvement, I'd like to also see that the signal is still true to the original, not just a neural network's interpretation. Can the DeepSNF data faithfully be used to say something about expression levels between cells? Or has the underlying data been altered too much?
We thank the Reviewer for their positive interpretation of this work. In particular we are encouraged by the Reviewer comment about image quality, something that is hard to measure with real-world data but that certainly improves our ability to assess and navigate these complex datasets. Further, we have endeavored to show that there are objective improvements in the image and single-cell data quality following denoising, and that these results exceed conventional approaches through comprehensive comparison.
Through clarification and additional experimentation and analysis, we have further validated that we have not changed the scale and linearity of the single-cell data. The DeepSNiF algorithm filtered the noise in the image so that image quality and downstream analysis results have been improved. We also included comparisons to paired immunofluorescence images for additional reassurance that DeepSNiF is true to the original image signal (Supp. Fig. 23).

R1.9:
Finally, IMC analysis is very complicated and this is a barrier for many users. A simple all-round algorithm to improve noise reduction would certainly contribute to the field. While IMC-denoise potentially improves the quality of the data significantly, it has not been made clear how others might make use of this technology. How simple would it be to run for a new user? Are there any input variables that need to be optimised? How much programming skill would be required? I miss details on implementation, especially as the authors end with: "We expect IMC-Denoise to become a widely used pipeline in IMC analysis due to its adaptability, effectiveness and flexibility". A schematic figure describing the workflow for a user of IMC-Denoise would be helpful.
We thank the Reviewer for this comment. Previously, we had uploaded our software package on our GitHub page (https://github.com/PENGLU-WashU/IMC_Denoise). We have provided a detailed tutorial file that covers the installation and implementation of the software package. In order to ensure that the community is able to access and implement our approach and findings, we have added to this revision a schematic figure (revised Supplementary Fig. 12) and described this figure in new Supplementary Note 4 (Page 32-33 of the Supplementary Materials), which describes each individual step of the process. We have provided multiple tutorials as Jupyter Notebooks for users with limited programming experience. For those with some coding skills, script files with parameter descriptions have also been provided. We hope our tutorials and descriptions help users to use our software package more easily. The GitHub page is kept updated, and users can post their questions on the page, so that we can help them in their specific cases.
Minor comments:
R1.10:
DeepSNF as a name for the algorithm is not original and may cause confusion. SNF has been used as abbreviation for Similarity Network Fusion and based on this there is even a paper describing DeepSNF: Luciano & Hamza, 2019 https://doi.org/10.1007/s00371-019-01668-9
We were not aware, and thank the Reviewer for this information. We have changed DeepSNF to DeepSNiF (Deep learning-based Shot Noise image Filtering) in the manuscript, figures, supplemental materials, and software package. Further we have confirmed that there are no other algorithms with the same name.

R1.11:
The introduction mentions decalcification required bone marrow samples, but the methods section does not mention whether this was applied to the tissues analysed here.
The decalcification step was applied to our tissues. We have added the descriptions from Line 512 to 513 in the article file (redline version).

R1.12:
The results section compares many different alternative methods for filtering noise. The many abbreviations for these methods make the text quite difficult to read for those that are not familiar with these methods.
We appreciate this feedback. We have added a summary table for these algorithms in Methods section on Line 601. Furthermore, the detailed descriptions of these algorithms have been discussed in Supplementary Note 2.

R1.13:
The last paragraph discussing fig 2b and 2c should be revised, due to repetition and some inconsistency (p10).
The text has been revised in the results and discussion for Fig 2.
R1.14:
Labelling of figure 4a, should CD20 be green in the left panels?
There are multiple colors used in Fig. 4a. Therefore, color crosstalk is easily generated. To make the figure clearer, we label CD20 with gray in the revised figure. As a result
-This work is relevant to all fields which make use of histopathology and immunohistochemistry techniques, including but not limited to cancer and autoimmunity.
-Current methods of removing image noise are imperfect and improvements are needed.
-IMC allows to visualize and extract data from >50 analytes on a single tissue section. This is of value because the interpretation of some analytes is only possible in the context of other analytes; because some samples are rare and irreplaceable, because it can be difficult to visualize low density antigens against background using standard immunohistochemistry methods such as immunofluorescence.
We thank the Reviewer for their close reading of the work, and appreciate their comments to improve this manuscript.

R2.2:
Does the work support the conclusions and claims, or is additional evidence needed?
-Yes the work supports the claims and conclusions. Denoising with DIMR and DeepSNF appear to reduce background to a comparable degree or in some cases better than other algorithms used in the literature, but also appears to reduce resolution visually.
1. -Fig 1f: some markers seem to disappear after DeepSNF denoising (ie CD31, CD20). Is this biologically true? Were there other analytes tested in the same tissue to corroborate the results (ie CD19 for B cells?)
The Reviewer appears to be asking why some signal is different between the DIMR-processed and the DeepSNiF-processed portions of the images in Fig. 1f. First, we will describe the results in Fig. 1f: here we have placed DIMR and DeepSNiF-processed portions of images directly next to each other for better visual inspection. The lower left half of each image corresponds to DIMR processing of the lower left image region, while the upper right is DeepSNiF processing of the upper right image region. For side-by-side comparison of DIMR and DeepSNiF-processed images to the identical regions on the raw images, please refer to Supp Fig 17, which shows the full field of view of the raw image for each channel shown in Fig. 1f.
Normally, the unspecific staining signal should be lower than true signal. However, because of shot noise, some pixels may seem brighter (Supplementary Note 1.2.1 on Page 4 of the Supplementary Materials). The disappearing signal in the channels highlighted by the Reviewer is likely due to the fact that there is no positive non-noise signal there.
Morphologically, the distributed signal in the CD20 channel, for example, is representative of a random noise-field, without actual cell-demarked signals.
Since there are very few cells positive for CD31 and CD20 in the image chosen for Fig. 1f, please refer to the images in Supp Fig 19 and Supp Fig 20 (Pages 40 and 41
-Fig 2h is showing that CD20+ gated cells express some amount of MPO. Is this a biological truth or is it an artifact of the image segmentation? Same with CD3+ and CD20+ gated populations having some expression of CD11b

R2.3:
This is an important question. The expression of CD11b, CD15, CD71, CD235a and MPO on CD20+ and CD3+ gated cells has multiple causes. First, technical imperfections due to sectioning, staining, and segmentation can cause this issue. Tissue sections are 4-6 microns in thickness, and since the full thickness of the section is ablated in IMC, adjacent cells from above or below within the section will appear as overlapping staining patterns. Since MPO and CD11b are bright markers of the most common cell populations, these are most likely to overlap. Such false segmentation may cause the false positive signal in CD3+ and CD20+ cells. Second, noise present from the image formation process can contribute to the issue of marker signal presence. We have analyzed this condition mathematically in Supplementary Note 1.2.2 from Page 4 to 5 of the Supplementary Materials. Briefly, the hot pixel noise and the shot noise are likely to generate false positive or false negative single cell data, and further, the shot noise will affect low intensity regions more due to the characteristics of shot noise.
In this paper, we focus on how denoising algorithms can address the second class of artifacts. The first class of artifacts could be mitigated by better segmentation algorithms and sectioning/staining procedures. This is the reason why, even after denoising, there are still false positives in CD3+ and CD20+ cells. We have added this clarification from Line 282 to 284 of the article file (redline version).
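The statement that shot noise affects low intensity regions more can be illustrated numerically. The sketch below is an assumption-labeled illustration rather than the analysis in Supplementary Note 1.2.2: it assumes shot noise follows Poisson counting statistics, for which the standard deviation equals the square root of the mean, so the relative noise is 1/sqrt(mean). The function name relative_shot_noise and the chosen mean counts are hypothetical.

```python
import numpy as np


def relative_shot_noise(mean_counts, n_pixels=100_000, seed=0):
    """Simulated std/mean of Poisson-distributed pixel counts.

    Assumes pure Poisson shot noise; theory predicts 1 / sqrt(mean_counts).
    """
    rng = np.random.default_rng(seed)
    samples = rng.poisson(lam=mean_counts, size=n_pixels)
    return samples.std() / samples.mean()


# Compare dim and bright regions (hypothetical mean ion counts per pixel).
for mean_counts in (1, 4, 25, 100):
    simulated = relative_shot_noise(mean_counts)
    expected = 1.0 / np.sqrt(mean_counts)
    print(f"mean={mean_counts:>3}  simulated={simulated:.3f}  theory={expected:.3f}")
```

Under this assumption a pixel averaging 1 count fluctuates by roughly its own value, while a pixel averaging 100 counts fluctuates by about a tenth of its value, which is why dim markers are the ones most prone to false positive or false negative single-cell calls.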

R2.4:
Are there any flaws in the data analysis, interpretation and conclusions? -Do these prohibit publication or require revision? -With the disclosure that I am not an expert on computer algorithms and bioinformatic analysis, it does seem that the data analysis and interpretation support the conclusion, with the caveats stated above Is the methodology sound? Does the work meet the expected standards in your field? -Yes, the initial segmentation analysis is based on a well-established pipeline published by the Bodenmiller group Is there enough detail provided in the methods for the work to be reproduced? -Yes We thank the Reviewer for the positive assessments. We have also included Supplementary and external (GitHub) data and pipeline information in order to have this work reproduced on these and other data sets. R.2.5: Additional comments/questions: Authors are encouraged to discuss the questions posed below: 3. -What is the maximum amount of "noise" before a channel is deemed unable to denoise? Is there an empirical way to determine this?
This is an interesting question. In theory, there is no preset maximum noise level for denoising algorithms. Even under some extremely noisy conditions (CD20 and CD3 in Fig. 1f
R2.6:
-Have analytes that have low signal-to-noise ratio been tested?
Yes, we have tested low SNR images. A major motivation for this work is that highly complex tissues, such as healthy and diseased marrow and many tumor tissues, are difficult to assess because of low SNR image data in many channels of interest. We have pursued IMC because other techniques, namely IF, fail in multiplexed workflows for some of these difficult-to-image tissues. This may be due to the need for repeated wash steps and the overpowering autofluorescent background. In the presented IMC work, please refer to the CD20 and CD3 channels in Fig. 1f

R2.7:
-How would the pipeline adapt to analytes that should visualize targets localized in between nucleated cells (i.e. synapses)? Denoising these channels often results in reduction of image intensity in areas where a signal is expected.
Small areas of staining at the size of a sub-cellular synapse (e.g. 1-2 microns in diameter) will not be successfully distinguished by IMC because of its relatively low resolution of 1 micron, according to Nyquist sampling theory. It is therefore unlikely that a deep learning network can learn the features of such small structures. Instead, we and others have tested markers with interstitial staining patterns (e.g. vessels, fibrosis, reticular cytoplasmic projections). As we demonstrate in this work, these can be effectively restored using our hot pixel removal and denoising pipeline (for example, CD31, CD34 and Collagen III in Figs. 1, 2 and Supplementary Figs. 19, 20 (Pages 40 and 41 of the Supplementary Materials)). In particular, the data in Fig. 2b-c demonstrate that images processed by our denoising algorithm yield higher signal segmentation accuracy than raw images. We have added this discussion on Lines 449 to 454 of the article file (redline version).
R2.8: -How would the software handle instances where cell signal is unexpected; ie if a novel cell type is discovered, how would they distinguish between true positive and false positive?
The cell types in multiplexed images are determined by the expression of multiple single markers. Therefore, the answer is determined by the data in each single marker channel. Compared to raw inputs, IMC-Denoise robustly improves image quality, removes background signal and maintains true signals across individual single marker images (Fig. 2b,c and Supplementary Figs. 23, 28b, 29b, 30b (Pages 44 and 49 to 51 of the Supplementary Materials)). At the same time, IMC-Denoise improves single cell data while maintaining the original scale and linearity (Pages 52 to 54 of the Supplementary Materials). Therefore, IMC-Denoise is capable of assisting in the delineation and study of new cell types by enhancing the quality of the data for each single marker. We submit that complementary, or orthogonal, methods such as RNAseq or proteomic analyses would be needed to confirm such findings.

R2.9:
Has this been tested on tissues where there is a mix of morphologically heterogenous cells? (ie brain tissue where there are a mix of glial cells, immune cells, and fibroblasts).
We have performed several experiments to verify that the algorithm works on tissues with morphologically heterogeneous cells. 1) We trained our network with multiple markers (Supplementary Fig. 45 (Fig. 1f) validates the adaptability of DeepSNiF as well. In fact, the networks can learn all the features present in the training images. Therefore, whatever the cell shapes are, the network can restore the images as long as all the essential features are present in the training set. We have added this discussion on Lines 445 to 450 of the article file (redline version).
R2.10:
What happens when there are processes extending from glial cells that are present in the section but their corresponding nuclei are not in the same plane?
The DeepSNiF training and denoising algorithms are not based on nuclear signal or morphology. Although glial cells are too rare in the tissue types included here (bone marrow, breast, and pancreatic tissues), we have successfully applied this algorithm to other cell types where the nucleus is not present within the tissue section (endothelial cells, macrophage projections) and to fine interstitial structures (collagen III). We found that the network works on such structures (CD31, CD34 and Collagen III in Figs. 1, 2 and Supplementary Figs. 19, 20 (Pages 40 and 41 of the Supplementary Materials)). We conclude that the networks learn all the features present in the training set rather than focusing on any single specific structure. We have added this discussion on Lines 449 to 452 of the article file (redline version).

Reviewer #3:
This study develops IMC-Denoise, an automated denoising pipeline to enhance Imaging Mass Cytometry (IMC) images. Two primary sources of noise are addressed here: hot pixels and shot noise. To remove hot pixels, IMC-Denoise implements a differential intensity map-based restoration (DIMR); to filter shot noise, it implements a deep learning method, DeepSNiF. The approach presented here is of intrinsic interest as it develops some new algorithms and approaches that are illuminating and useful. The extensive benchmarking and comparisons with existing denoising methods are strengths of the study. The end-to-end analysis, including the impact on automated cell phenotyping, is important and further strengthens the study. Some concerns that should be addressed to improve the interpretability and, hopefully, the utility of the study are listed below.
R3.1: How many iterations of DIMR does it take to adequately remove the hot pixels in data? Some quantification of the variability would be very helpful. (The paper indicates n = 4 and iteration number is set at 3.) How computationally intensive is the approach?
We thank the reviewer for helping us clarify this point for the readers. To address this question, we have utilized simulated data to test the iteration times and the running time of